Abstract

Ranks of subspaces of vector spaces satisfy all linear inequalities satisfied by entropies (including
the standard Shannon inequalities) and an additional inequality due to Ingleton. It is known that
the Shannon and Ingleton inequalities generate all such linear rank inequalities on up to four
variables, but it has been an open question whether additional inequalities hold for the case of
five or more variables. Here we give a list of 24 inequalities which, together with the Shannon and
Ingleton inequalities, generate all linear rank inequalities on five variables. We also give a partial
list of linear rank inequalities on six variables and general results which produce such inequalities
on an arbitrary number of variables; we prove that there are essentially new inequalities at each
number of variables beyond four (a result also proved recently by Kinser).

1 Introduction

It is well known that the linear inequalities always satisfied by ranks of subspaces of a vector space
(referred to here as linear rank inequalities) are closely related to the linear inequalities satisfied by
entropies of jointly distributed random variables (often referred to as information inequalities). For
background material on this relationship and other topics used here, a useful source is Hammer,
Romashchenko, Shen, and Vereshchagin [9].

The present paper is about linear rank inequalities; nonetheless, the basic results from information
theory will be useful enough that we choose to use the notation of information theory here.
We use the following common definitions:

$$H(A \mid B) = H(A,B) - H(B)$$
$$I(A;B) = H(A) + H(B) - H(A,B)$$
$$I(A;B \mid C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)$$

There are two interpretations of these equations. When $A$, $B$, and $C$ are random variables,
$A,B$ denotes the joint random variable combining $A$ and $B$; $H(A)$ is the entropy of $A$;
$H(A \mid B)$ is the entropy of $A$ given $B$; $I(A;B)$ is the mutual information of $A$ and $B$;
and $I(A;B \mid C)$ is the mutual information of $A$ and $B$ given $C$.

But when $A$, $B$, and $C$ denote subspaces of a vector space, then $A,B$ denotes the space spanned
by $A$ and $B$, which is $\langle A,B \rangle$ or, since $A$ and $B$ are subspaces, just $A+B$;
$H(A)$ is the rank of $A$; $H(A \mid B)$ is the excess of the rank of $A$ over that of $A \cap B$;
$I(A;B)$ is the rank of $A \cap B$; and $I(A;B \mid C)$ is the excess of the rank of
$(A+C) \cap (B+C)$ over that of $C$. In either interpretation, the equations above are valid.

The basic Shannon inequalities state that $I(A;B \mid C)$ (as well as the reduced forms $I(A;B)$,
$H(A \mid B)$, and $H(A)$) is nonnegative for any random variables $A$, $B$, $C$. Any nonnegative
linear combination of basic Shannon inequalities is called a Shannon inequality. We will use standard
Shannon computations such as $I(A;B \mid C) = I(A;B,C) - I(A;C)$ (one can check this by expanding
into basic $H$ terms) and $H(A \mid C) \ge H(A \mid B,C)$ (because the difference is $I(A;B \mid C)$)
throughout this paper; an excellent source for background material on this is Yeung [15].

A key well-known fact is that all information inequalities (and in particular the Shannon
inequalities) are also linear rank inequalities for finite-dimensional vector spaces. To see this, first
note that in the case of a finite vector space $V$ over a finite field $F$, each subspace can be turned
into a random variable so that the entropy of the random variable is the same (up to a constant
factor) as the rank of the subspace: let $X$ be a random variable ranging uniformly over $V^*$ (the
set of linear functions from $V$ to $F$), and to each subspace $A$ of $V$ associate the random
variable $X|_A$ (the restriction of $X$ to $A$). The entropy of this random variable will be the rank
of $A$, if entropy logarithms are taken to base $|F|$. For the infinite case, one can use the theorem
of Rado [14] that any representable matroid is representable over a finite field, and hence any
configuration of finite-rank vector spaces over any field has a corresponding configuration over
some finite field.

The converse is not true; there are linear rank inequalities which are not information inequalities.
The first such example is the Ingleton inequality, which in terms of basic ranks or joint entropies is

$$H(A) + H(B) + H(C,D) + H(A,B,C) + H(A,B,D) \le H(A,B) + H(A,C) + H(B,C) + H(A,D) + H(B,D),$$

but which can be written more succinctly using the $I$ notation as

$$I(A;B) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D).$$

Ingleton [10] proved this inequality and asked whether there are still further independent
inequalities of this kind.
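
The rank interpretation of $H$ and $I$ can be made concrete with a small computation. The following is a minimal sketch, not part of the paper: it assumes subspaces of $\mathrm{GF}(2)^3$ given by generating vectors encoded as integer bitmasks, and the helper names `rank_gf2`, `H`, and `I` are hypothetical. It evaluates both sides of the Ingleton inequality for one choice of four subspaces.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            # Reduce v by b whenever that lowers v, i.e. whenever the
            # leading bit of b is set in v.
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def H(*spaces):
    """Rank of the sum (span of the union) of the given subspaces."""
    return rank_gf2([v for s in spaces for v in s])


def I(X, Y, Z=()):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); Z may be empty."""
    Z = list(Z)
    return H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)


# Four subspaces of GF(2)^3 (illustrative choice), bit i = coordinate i.
A = [0b001]
B = [0b010]
C = [0b011]          # spanned by e1 + e2
D = [0b100, 0b011]   # spanned by e3 and e1 + e2

ingleton_lhs = I(A, B)
ingleton_rhs = I(A, B, C) + I(A, B, D) + I(C, D)
```

With this choice the left-hand side is the rank of $A \cap B$, and the right-hand side is computed purely from ranks of sums, as in the definitions above.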

A key tool used by Hammer et al. [9] is the notion of common information. A random variable
$Z$ is a common information of random variables $A$ and $B$ if it satisfies the following conditions:
$H(Z \mid A) = 0$, $H(Z \mid B) = 0$, and $H(Z) = I(A;B)$. In other words, $Z$ encapsulates the mutual
information of $A$ and $B$. In general, two random variables $A$ and $B$ might not have a common
information. But in the context of vector spaces (or the random variables coming from them),
common informations always exist; if $A$ and $B$ are subspaces of a vector space, one can just let $Z$
be the intersection of $A$ and $B$, and $Z$ will have the desired properties.

Hammer et al. [9] showed that the Ingleton inequality (and its permuted-variable forms) and the
Shannon inequalities fully characterize the cone of linearly representable entropy vectors on four
random variables (i.e., there are no more linear rank inequalities to be found on four variables).
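
The correspondence between ranks and entropies described earlier in this section (a uniformly random linear functional restricted to a subspace) can be checked by brute force on small examples. The sketch below is illustrative only; the function name `restriction_entropy` and the example subspaces are assumptions made for this illustration, and the field is taken to be $\mathrm{GF}(p)$ for a prime $p$.

```python
from collections import Counter
from itertools import product
from math import log


def restriction_entropy(generators, n, p=2):
    """Entropy (base p) of a uniform linear functional on GF(p)^n
    restricted to the subspace spanned by `generators`."""
    counts = Counter()
    for f in product(range(p), repeat=n):  # f ranges over all of V*
        # The restriction of f to the subspace is determined by the
        # values f takes on a generating set.
        values = tuple(
            sum(fi * gi for fi, gi in zip(f, g)) % p for g in generators
        )
        counts[values] += 1
    total = p ** n
    return -sum(c / total * log(c / total, p) for c in counts.values())


# A rank-2 subspace of GF(2)^3 given by three dependent generators.
entropy_rank2 = restriction_entropy([(1, 1, 0), (0, 1, 1), (1, 0, 1)], n=3)
# A rank-1 subspace of GF(3)^2.
entropy_rank1 = restriction_entropy([(1, 2)], n=2, p=3)
```

In both cases the computed entropy agrees (up to floating-point error) with the rank of the subspace, because the restriction map from $V^*$ to the dual of the subspace is linear and onto.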



2 New five-variable inequalities 

We will answer Ingleton's question here. Using the existence of common informations, one can 
prove the following twenty-four new linear rank inequalities on five variables (this is a complete 
and irreducible list, as will be explained below). 

$$I(A;B) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D \mid E) + I(A;E) \tag{1}$$
$$I(A;B) \le I(A;B \mid C) + I(A;C \mid D) + I(A;D \mid E) + I(B;E) \tag{2}$$
$$I(A;B) \le I(A;C) + I(A;B \mid D) + I(B;E \mid C) + I(A;D \mid C,E) \tag{3}$$
$$I(A;B) \le I(A;C) + I(A;B \mid D,E) + I(B;D \mid C) + I(A;E \mid C,D) \tag{4}$$
$$I(A;B) \le I(A;C) + I(B;D \mid C) + I(A;E \mid D) + I(A;B \mid C,E) + I(B;C \mid D,E) \tag{5}$$
$$I(A;B) \le I(A;C) + I(B;D \mid E) + I(D;E \mid C) + I(A;B \mid C,D) + I(A;C \mid D,E) \tag{6}$$
$$I(A;B) \le I(A;C \mid D) + I(A;E \mid C) + I(B;D) + I(B;D \mid C,E) + I(A;B \mid D,E) \tag{7}$$
$$2I(A;B) \le I(A;B \mid C) + I(A;B \mid D) + I(A;B \mid E) + I(C;D) + I(C,D;E) \tag{8}$$
$$2I(A;B) \le I(A;C) + I(A;B \mid D) + I(A;B \mid E) + I(D;E) + I(B;D,E \mid C) \tag{9}$$
$$2I(A;B) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D) + I(A;E) + I(B;D \mid E) + I(A;C \mid D,E) \tag{10}$$
$$I(A;B,C) \le I(A;C \mid B,D) + I(A;C,E) + I(A;B \mid D,E) + I(B;D \mid C,E) \tag{11}$$
$$I(A;B,C) \le I(A;C) + I(A;B \mid D) + I(A;D \mid E) + I(B;E \mid C) + I(A;C \mid B,E) + I(C;E \mid B,D) \tag{12}$$
$$I(A;B,C) \le I(A;B \mid D) + I(A;C,E) + I(B;D \mid C,E) + I(A;C \mid B,E) + I(C;E \mid B,D) \tag{13}$$
$$I(A;B,C) \le I(A;D) + I(B;E \mid D) + I(A;B \mid C,E) + I(A;C \mid B,D) + I(A;C \mid D,E) \tag{14}$$
$$I(A;B,C) \le I(A;D) + I(B;E \mid D) + I(A;C \mid E) + I(A;B \mid C,D) + I(A;C \mid B,D) + I(B;D \mid C,E) \tag{15}$$
$$I(A;B,C) \le I(A;B \mid C,D) + I(A;C \mid B,D) + I(B,C;D \mid E) + I(B;C \mid D,E) + I(A;E) \tag{16}$$
$$I(A,B;C,D) \le I(A,B;D) + I(A;D \mid B,C) + I(B;D \mid A,C) + I(A;C \mid B,E) + I(B;C \mid A,E) + I(A;B \mid D,E) + I(C;E \mid D) \tag{17}$$
$$I(A;B) + I(A;C) \le I(B;C) + I(A;B \mid D) + I(A;C \mid D) + I(B;D \mid E) + I(C;D \mid E) + I(A;E) \tag{18}$$
$$I(A;B) + I(A;C) \le I(B;D) + 2I(A;C \mid D) + I(A;B \mid E) + I(D;E) + I(B;E \mid C,D) + I(C;D \mid B,E) \tag{19}$$
$$I(A;B) + I(A;C) \le I(B;C) + I(B;D) + I(A;C \mid D) + I(A;B \mid E) + I(A;E \mid B) + I(C;D \mid E) + I(B;E \mid C,D) \tag{20}$$
$$I(A;B) + I(A;C) \le I(B;D) + I(A;C \mid D) + I(A;D \mid E) + I(C;E) + I(A;B \mid C,E) + I(B;C \mid D,E) + I(B;E \mid C,D) \tag{21}$$
$$2I(A;B) + I(A;C) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D) + I(A;C \mid E) + I(A;D \mid E) + 2I(B;E) + I(B;C \mid D,E) + I(C;E \mid B,D) \tag{22}$$
$$I(A;B) + I(A;B,C) \le I(A;B \mid D) + 2I(A;C \mid E) + I(B;E) + I(D;E) + I(A;B \mid C,D) + 2I(B;D \mid C,E) + I(C;E \mid B,D) \tag{23}$$
$$I(A;C,D) + I(B;C,D) \le I(B;D) + I(B;C \mid E) + I(C;E \mid D) + I(A;E) + I(A;C \mid B,D) + I(A,B;D \mid C) + I(A;D \mid B,E) + I(A;B \mid D,E) \tag{24}$$

(Note that there is much more variety of form in these inequalities than there is in the
four-variable non-Shannon-type inequalities from [5].)
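
As a sanity check (not a proof), inequalities such as (1) and (2) can be tested numerically on randomly chosen subspaces. The sketch below is an illustration under stated assumptions: subspaces live in $\mathrm{GF}(2)^6$ and are given by random generators encoded as bitmasks; the helper names `rank_gf2`, `H`, `I`, `slack_1`, `slack_2`, and `random_subspace` are hypothetical and do not come from the paper or from ITIP.

```python
import random


def rank_gf2(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # reduce v by b when b's leading bit is in v
        if v:
            basis.append(v)
    return len(basis)


def H(*spaces):
    """Rank of the sum of the given subspaces."""
    return rank_gf2([v for s in spaces for v in s])


def I(X, Y, Z=()):
    """I(X;Y|Z) computed from ranks; Z may be empty."""
    Z = list(Z)
    return H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)


def slack_1(A, B, C, D, E):
    """Right-hand side minus left-hand side of inequality (1)."""
    return I(A, B, C) + I(A, B, D) + I(C, D, E) + I(A, E) - I(A, B)


def slack_2(A, B, C, D, E):
    """Right-hand side minus left-hand side of inequality (2)."""
    return I(A, B, C) + I(A, C, D) + I(A, D, E) + I(B, E) - I(A, B)


def random_subspace(rng, n=6, max_generators=3):
    """Generators of a random nonzero subspace of GF(2)^n."""
    k = rng.randint(1, max_generators)
    return [rng.randrange(1, 2 ** n) for _ in range(k)]


rng = random.Random(0)  # fixed seed so the run is reproducible
min_slack = None
for _ in range(2000):
    spaces = [random_subspace(rng) for _ in range(5)]
    s = min(slack_1(*spaces), slack_2(*spaces))
    min_slack = s if min_slack is None else min(min_slack, s)
```

A negative `min_slack` would indicate a bug in the transcription of an inequality; a nonnegative value is consistent with (1) and (2) being linear rank inequalities but proves nothing by itself.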

Each of these inequalities is provable from the Shannon inequalities if we assume that each
mutual information on the left-hand side of the inequality is in fact realized by a common
information. (Hence, since such common informations always exist in the linear case, the inequalities
are all linear rank inequalities.) For instance, inequalities (1)-(10) all hold if we assume that there
is a random variable $Z$ such that $H(Z \mid A) = H(Z \mid B) = 0$ and $H(Z) = I(A;B)$; inequality
(23) holds if there exist random variables $Z$ and $Y$ such that
$H(Z \mid A) = H(Z \mid B) = H(Y \mid A) = H(Y \mid B,C) = 0$, $H(Z) = I(A;B)$, and $H(Y) = I(A;B,C)$;
and so on. These assertions can all be verified using the program ITIP [16]. In fact, all of these
become Shannon inequalities if we replace the left-hand mutual information(s) with terms $H(Z)$ or
$H(Y)$ and add to the right-hand side appropriate terms like $kH(Z \mid A) + kH(Z \mid B)$ for a
sufficiently large coefficient $k$ ($k = 5$ suffices for all of these inequalities). For example, for
inequality (1), one can show that

$$H(Z) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D \mid E) + I(A;E) + 5H(Z \mid A) + 5H(Z \mid B)$$

is a Shannon inequality; if we set $Z$ to be a common information for $A$ and $B$, we get inequality (1).
Again the verifications of these Shannon inequalities can be performed using ITIP, or one can
work them out explicitly. In Section 3 we will present various alternate proof techniques.

These inequalities can be written in other equivalent forms.

Obvious rewrites (move the first term on the right to the left):

$$I(A;B \mid C) \le I(A;B \mid D) + I(A;D \mid E) + I(B;E \mid C) + I(A;C \mid B,E) + I(C;E \mid B,D) \tag{12a}$$
$$I(A,B;C \mid D) \le I(A;D \mid B,C) + I(B;D \mid A,C) + I(A;C \mid B,E) + I(B;C \mid A,E) + I(A;B \mid D,E) + I(C;E \mid D) \tag{17a}$$
$$I(A;C,D) + I(B;C \mid D) \le I(B;C \mid E) + I(C;E \mid D) + I(A;E) + I(A;C \mid B,D) + I(A,B;D \mid C) + I(A;D \mid B,E) + I(A;B \mid D,E) \tag{24a}$$

Obvious rewrites (enlarge terms on the left so they can be combined):

$$2I(A;B,C) \le I(A;C \mid B) + I(A;B \mid C) + I(B;C) + I(A;B \mid D) + I(A;C \mid D) + I(B;D \mid E) + I(C;D \mid E) + I(A;E) \tag{18b}$$
$$2I(A;B,C) \le I(A;C \mid B) + I(A;B \mid C) + I(B;D) + 2I(A;C \mid D) + I(A;B \mid E) + I(D;E) + I(B;E \mid C,D) + I(C;D \mid B,E) \tag{19b}$$
$$2I(A;B,C) \le I(A;C \mid B) + I(A;B \mid C) + I(B;C) + I(B;D) + I(A;C \mid D) + I(A;B \mid E) + I(A;E \mid B) + I(C;D \mid E) + I(B;E \mid C,D) \tag{20b}$$
$$2I(A;B,C) \le I(A;C \mid B) + I(A;B \mid C) + I(B;D) + I(A;C \mid D) + I(A;D \mid E) + I(C;E) + I(A;B \mid C,E) + I(B;C \mid D,E) + I(B;E \mid C,D) \tag{21b}$$
$$3I(A;B,C) \le 2I(A;C \mid B) + 2I(A;B \mid C) + I(A;B \mid D) + I(C;D) + I(A;C \mid E) + I(A;D \mid E) + 2I(B;E) + I(B;C \mid D,E) + I(C;E \mid B,D) \tag{22b}$$
$$2I(A;B,C) \le I(A;C \mid B) + I(A;B \mid D) + 2I(A;C \mid E) + I(B;E) + I(D;E) + I(A;B \mid C,D) + 2I(B;D \mid C,E) + I(C;E \mid B,D) \tag{23b}$$
$$2I(A,B;C,D) \le I(B;C,D \mid A) + I(A;C,D \mid B) + I(B;D) + I(B;C \mid E) + I(C;E \mid D) + I(A;E) + I(A;C \mid B,D) + I(A,B;D \mid C) + I(A;D \mid B,E) + I(A;B \mid D,E) \tag{24b}$$
Non-obvious rewrites:

$$I(A;C) \le I(A;C \mid B) + I(A;B \mid D) + I(C;D \mid E) + I(A;E) \tag{1c}$$
$$I(A;B \mid C) \le I(A;E \mid C) + I(A;C \mid B,D) + I(A;B \mid D,E) + I(B;D \mid C,E) \tag{11c}$$
$$I(A;B \mid C) \le I(A;B \mid D) + I(A;E \mid C) + I(B;D \mid C,E) + I(A;C \mid B,E) + I(C;E \mid B,D) \tag{13c}$$
$$I(B;C \mid D) \le I(B;C \mid A,D) + I(A;D \mid B,C) + I(B;E \mid D) + I(A;C \mid E) + I(B;D \mid C,E) \tag{15c}$$
$$I(B;C) \le I(B;D) + I(A;C \mid D) + I(C;D \mid A) + I(B;E \mid A) + I(B;C \mid D,E) + I(D;E \mid B,C) \tag{19c}$$
$$I(C;D \mid E) \le I(A;D \mid E) + I(C;D \mid A) + I(B;D \mid C,E) + I(B;C,E \mid A) + I(C;E \mid B,D) \tag{21c}$$
$$2I(A;C,D) \le I(A;D \mid C) + I(C;D \mid A) + I(A;C \mid B) + I(A;D \mid B) + I(A;C \mid E) + I(A;D \mid E) + 2I(B;E) + I(B;C \mid D,E) + I(C;E \mid B,D) \tag{22c}$$
$$I(B;D \mid E) \le I(B;D \mid A) + I(A;C \mid E) + I(C;E \mid A) + I(B;D \mid A,C) + I(D;E \mid B,C) + I(B;E \mid C,D) + I(B;D \mid C,E) \tag{23c}$$
$$I(A,E;D) \le I(B;D) + I(C;E \mid B) + I(D;E \mid C) + I(A;B \mid C,D) + I(A;D \mid B,C) + I(A;D \mid B,E) + I(A;E \mid B,D) \tag{24c}$$

Note that, for these variant forms, we do not make the claim that the inequality follows from
the existence of common informations corresponding to the left-hand-side terms. For instance,
inequality (19c) does not follow from the Shannon inequalities and the existence of a common
information for $B$ and $C$. It turns out that inequality (24b) is provable from existence of a common
information for $(A,B)$ and $(C,D)$, and inequalities (19b), (21b), (22b), and (23b) are provable
from existence of a common information for $A$ and $(B,C)$, but inequalities (18b) and (20b) are
not; in fact, no single common information (together with the Shannon inequalities) suffices to
prove (18) or (20).

3 Alternate proofs and generalizations 

In this section we will provide some alternate proof techniques for the inequalities. This will lead 
to natural generalizations. 

Lemma 1. The inequality $H(Z \mid R) + I(R;S \mid T) \ge I(Z;S \mid T)$ is a Shannon inequality.

Proof. Using Shannon inequalities, we see that

$$\begin{aligned}
H(Z \mid R) + H(S \mid Z,T) &\ge H(Z \mid R,T) + H(S \mid Z,T) \\
&\ge I(S;Z \mid R,T) + H(S \mid Z,T) \\
&\ge I(S;Z \mid R,T) + H(S \mid R,Z,T) \\
&= H(S \mid R,T).
\end{aligned}$$

So $H(Z \mid R) - H(S \mid R,T) \ge -H(S \mid Z,T)$; add $H(S \mid T)$ to both sides to get the desired result.

Corollary 2. If $H(Z \mid R) = 0$, then $I(R;S \mid T) \ge I(Z;S \mid T)$.

Proof of the Ingleton inequality. Let $Z$ be a common information of $A$ and $B$, so that
$H(Z \mid A) = H(Z \mid B) = 0$ and $H(Z) = I(A;B)$. Then

$$\begin{aligned}
I(A;B \mid C) + I(A;B \mid D) + I(C;D)
&\ge I(Z;B \mid C) + I(Z;B \mid D) + I(C;D) && \text{[from Corollary 2 using } H(Z \mid A) = 0\text{]} \\
&\ge I(Z;Z \mid C) + I(Z;Z \mid D) + I(C;D) && \text{[from Corollary 2 using } H(Z \mid B) = 0\text{]} \\
&= H(Z \mid C) + H(Z \mid D) + I(C;D) \\
&\ge H(Z \mid C) + I(Z;C) && \text{[from Lemma 1]} \\
&\ge I(Z;Z) && \text{[from Lemma 1]} \\
&= H(Z) \\
&= I(A;B).
\end{aligned}$$

This is essentially the proof given in Hammer et al. [9].
Proof of inequality (1). Let $Z$ be a common information of $A$ and $B$; then

$$\begin{aligned}
I(A;B \mid C) + I(A;B \mid D) + I(C;D \mid E) + I(A;E)
&\ge I(Z;Z \mid C) + I(Z;Z \mid D) + I(C;D \mid E) + I(Z;E) && \text{[from Corollary 2 five times]} \\
&= H(Z \mid C) + H(Z \mid D) + I(C;D \mid E) + I(Z;E) \\
&\ge I(Z;Z \mid E) + I(Z;E) && \text{[from Lemma 1 twice]} \\
&= H(Z \mid E) + I(Z;E) \\
&= H(Z) \\
&= I(A;B).
\end{aligned}$$

Proof of inequality (2). Let $Z$ be a common information of $A$ and $B$; then

$$\begin{aligned}
I(A;B \mid C) + I(A;C \mid D) + I(A;D \mid E) + I(B;E)
&\ge I(Z;Z \mid C) + I(Z;C \mid D) + I(Z;D \mid E) + I(Z;E) && \text{[from Corollary 2]} \\
&= H(Z \mid C) + I(Z;C \mid D) + I(Z;D \mid E) + I(Z;E) \\
&\ge I(Z;Z \mid D) + I(Z;D \mid E) + I(Z;E) && \text{[from Lemma 1]} \\
&= H(Z \mid D) + I(Z;D \mid E) + I(Z;E) \\
&\ge I(Z;Z \mid E) + I(Z;E) && \text{[from Lemma 1]} \\
&= H(Z \mid E) + I(Z;E) \\
&= H(Z) \\
&= I(A;B).
\end{aligned}$$

The same pattern allows us to prove more general inequalities: if $A_0$ and $B_0$ have a common
information, then:

$$\begin{aligned}
I(A_0;B_0) \le{} & I(A_0;B_0 \mid B_1) \\
& + I(A_0;B_1 \mid B_2) \\
& + \cdots \\
& + I(A_0;B_{n-1} \mid B_n) \\
& + I(B_0;B_n)
\end{aligned} \tag{25}$$

$$\begin{aligned}
I(A_0;B_0) \le{} & 2^{n-1} I(A_0;B_0 \mid A_1) + 2^{n-1} I(A_0;B_0 \mid B_1) \\
& + 2^{n-2} I(A_1;B_1 \mid A_2) + 2^{n-2} I(A_1;B_1 \mid B_2) \\
& + \cdots \\
& + I(A_{n-1};B_{n-1} \mid A_n) + I(A_{n-1};B_{n-1} \mid B_n) \\
& + I(A_n;B_n)
\end{aligned} \tag{26}$$

(Note that (26) is related to results in Makarychev and Makarychev [12].) These can be generalized
further; for instance, in the right hand side of (25) any number of $A_0$'s may be replaced by $B_0$'s
and/or vice versa.
In fact:

Theorem 3. Suppose we have a finite binary tree where the root is labeled with an information
term $I(x;y)$ and each other node is labeled with a term $I(x;y \mid z)$. These terms may involve any
variables. We single out two variables or combinations of variables, called $A$ and $B$. Suppose
that, for each node of the tree, if its label is $I(x;y \mid z)$ [we allow $z$ to be empty at the root], then:

(a) $x$ is $A$ or $B$ and there is no left child, or

(b) there is a left child and it is labeled $I(r;s \mid x)$ for some $r$ and $s$;

and

(a') $y$ is $A$ or $B$ and there is no right child, or

(b') there is a right child and it is labeled $I(r';s' \mid y)$ for some $r'$, $s'$.

Then the inequality

$$I(A;B) \le \text{sum of all the node labels in the tree} \tag{27}$$

is a linear rank inequality (in fact, it is true whenever $A$ and $B$ have a common information).

Proof. Let $Z$ be a new variable. We prove by induction in the tree (from the leaves toward the
root) that, for each node $n$, if $T_n$ is the subtree rooted at $n$, and the node label at $n$ is
$I(r;s \mid t)$, then we have as a Shannon inequality

$$H(Z \mid t) \le \text{sum of node labels in } T_n + j_n H(Z \mid A) + k_n H(Z \mid B) \tag{28}$$

for some $j_n, k_n \ge 0$. (The inductive step uses Lemma 1.) Applying this when $n$ is the root and $Z$
is a common information of $A$ and $B$ gives the desired result. ■
We get the Ingleton inequality and inequalities (1) and (2) by applying this to the trees:

Ingleton:        I(C;D)
                /      \
         I(A;B|C)    I(A;B|D)

(1):     I(A;E)
              \
             I(C;D|E)
             /      \
      I(A;B|C)    I(A;B|D)

(2):     I(B;E)
              \
             I(A;D|E)
                  \
                 I(A;C|D)
                      \
                     I(A;B|C)

A longer "linear" tree like the last one gives (25), while a complete binary tree of height $n$ gives
(26).

Here is another version of Theorem 3.

Theorem 4. Let $I(x_1;y_1 \mid w_1), I(x_2;y_2 \mid w_2), \ldots, I(x_m;y_m \mid w_m)$ be a list of
information terms, where each $x_i$, $y_i$, $w_i$ is chosen from the list $A, B, r_1, r_2, \ldots$, with the
exception that $w_1$ is empty (i.e., the first information term is just $I(x_1;y_1)$). Suppose that each
of the variables $r_j$ is used exactly twice, once as a $w_i$ and once as an $x_i$ or $y_i$; while variables
$A$ and $B$ may be used as many times as desired as an $x_i$ or $y_i$, but are not used as a $w_i$. Then
the inequality

$$I(A;B) \le \sum_{i=1}^{m} I(x_i;y_i \mid w_i)$$

is a linear rank inequality (in fact, it is true whenever $A$ and $B$ have a common information).

Proof. We build a tree for use in Theorem 3. Each node will be labeled with one of the terms
$I(x_i;y_i \mid w_i)$. The root is labeled $I(x_1;y_1)$. If we have a node $I(x_i;y_i \mid w_i)$ where $x_i$ is
not $A$ or $B$, then create a left child for this node and label it $I(x_j;y_j \mid w_j)$ for the unique $j$
such that $w_j = x_i$. Similarly, if $y_i$ is not $A$ or $B$, then create a right child for this node and
label it $I(x_j;y_j \mid w_j)$ for the unique $j$ such that $w_j = y_i$. It is easy to show that no term
$I(x_i;y_i \mid w_i)$ will be used more than once in this construction (look for the counterexample
nearest the root). Hence, the construction will terminate, and the sum of the labels used is less than
or equal to $\sum_{i=1}^{m} I(x_i;y_i \mid w_i)$ (it does not matter if some of the terms
$I(x_i;y_i \mid w_i)$ are not used as labels). Now Theorem 3 gives the desired result. ■
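
The hypotheses of Theorem 4 are purely combinatorial, so they are easy to check mechanically. The following sketch is an illustration, not the authors' method: terms are represented as hypothetical `(x, y, w)` triples of variable names, with `w = None` for the unconditioned first term, and the function name `satisfies_theorem4_hypotheses` is an assumption made for this example.

```python
from collections import Counter


def satisfies_theorem4_hypotheses(terms, special=("A", "B")):
    """Check the variable-usage conditions of Theorem 4.

    terms[0] must be unconditioned (w is None); every other term needs a
    conditioning variable; A and B never appear as w; every other variable
    appears exactly once as a w and exactly once as an x or y.
    """
    if not terms or terms[0][2] is not None:
        return False
    if any(w is None for _, _, w in terms[1:]):
        return False
    as_w = Counter(w for _, _, w in terms[1:])
    if any(w in special for w in as_w):
        return False
    as_xy = Counter(
        v for x, y, _ in terms for v in (x, y) if v not in special
    )
    variables = set(as_w) | set(as_xy)
    return all(as_w[v] == 1 and as_xy[v] == 1 for v in variables)


# Right-hand sides of the Ingleton inequality and of inequalities (1), (2).
ingleton = [("C", "D", None), ("A", "B", "C"), ("A", "B", "D")]
ineq_1 = [("A", "E", None), ("C", "D", "E"), ("A", "B", "C"), ("A", "B", "D")]
ineq_2 = [("B", "E", None), ("A", "D", "E"), ("A", "C", "D"), ("A", "B", "C")]
# D is used as a y but never as a w, so the hypotheses fail here.
not_covered = [("C", "D", None), ("A", "B", "C")]
```

For the substituted forms discussed next, a combination such as $(C,E)$ would simply be treated as a single variable name.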

Theorem 4 directly gives the Ingleton inequality and inequalities (1) and (2). It also gives a
number of the other listed inequalities once we write them in an equivalent form using equations
such as $I(A;B \mid C) = I(A;B,C \mid C)$:

$$I(A;B) \le I(A;C) + I(A;B \mid D) + I(B;C,E \mid C) + I(A;D \mid C,E) \tag{3d}$$
$$I(A;B) \le I(A;C) + I(A;B \mid D,E) + I(B;C,D \mid C) + I(A;D,E \mid C,D) \tag{4d}$$
$$I(A;B) \le I(A;C) + I(B;D \mid C) + I(A;D,E \mid D) + I(A;B \mid C,E) + I(B;C,E \mid D,E) \tag{5d}$$
$$I(A;B) \le I(A;C \mid D) + I(A;C,E \mid C) + I(B;D) + I(B;D,E \mid C,E) + I(A;B \mid D,E) \tag{7d}$$
$$I(A;B,C) \le I(A;B,C \mid B,D) + I(A;C,E) + I(A;B,D \mid D,E) + I(B,C;D,E \mid C,E) \tag{11d}$$
$$I(A;B,C) \le I(A;C) + I(A;B,D \mid D) + I(A;D \mid E) + I(B,C;E \mid C) + I(A;B,C \mid B,E) + I(B,C;B,E \mid B,D) \tag{12d}$$
$$I(A;B,C) \le I(A;B,D \mid D) + I(A;C,E) + I(B,C;D \mid C,E) + I(A;B,C \mid B,E) + I(C;B,E \mid B,D) \tag{13d}$$
$$I(A;B,C) \le I(A;D) + I(B,D;D,E \mid D) + I(A;B,C \mid C,E) + I(A;B,C \mid B,D) + I(A;C,E \mid D,E) \tag{14d}$$
$$I(A;B,C) \le I(A;D) + I(B,D;E \mid D) + I(A;C,E \mid E) + I(A;B,C \mid C,D) + I(A;B,C \mid B,D) + I(B,C;C,D \mid C,E) \tag{15d}$$
$$I(A;B,C) \le I(A;B,C \mid C,D) + I(A;B,C \mid B,D) + I(B,C;D,E \mid E) + I(B,D;C,D \mid D,E) + I(A;E) \tag{16d}$$
$$I(A,B;C,D) \le I(A,B;D) + I(A,B;C,D \mid B,C) + I(A,B;C,D \mid A,C) + I(A,B;B,C \mid B,E) + I(A,B;A,C \mid A,E) + I(A,E;B,E \mid D,E) + I(C,D;D,E \mid D) \tag{17d}$$

For instance, inequality (5d) is obtained from Theorem 4 using the list of random variables
$A, B, C, D, (C,E), (D,E)$. Another approach is to prove the inequality

$$I(A;B) \le I(A;C) + I(B;D \mid C) + I(A;F \mid D) + I(A;B \mid E) + I(B;E \mid F)$$

directly from Theorem 4 and then apply the variable substitution

$$(A, B, C, D, E, F) \to (A, B, C, D, (C,E), (D,E))$$

to get (5d). Similarly, the other inequalities listed above are substitution instances of linear-variable
inequalities on five to eight variables. (Note that (3d), (4d), and (11d) are substitution instances of
(1c).)

We will now generalize Theorem 3 so as to generate additional inequalities. One easy but
apparently useless generalization is to replace the binary tree with a binary forest (a finite disjoint
union of binary trees). Then the hypotheses of Theorem 3 can be stated just as before (with "the
root" replaced by "each root"); and the conclusion is the same except that the inequality becomes

$$m\,I(A;B) \le \text{sum of all the node labels in the trees} \tag{29}$$

where $m$ is the number of trees (equivalently, the number of root nodes).

This modification alone is useless because the resulting inequality is just a sum of Theorem 3
inequalities, one for each tree. But it will become useful when combined with another modification.
For this we need a tightening of Lemma 1.

Lemma 5. The inequality $H(Z \mid R) + I(R;S \mid T) \ge I(Z;S \mid T) + H(Z \mid R,S,T)$ is a Shannon
inequality.

Proof. The proof is just as for Lemma 1, with the slack made explicit in one step. Using Shannon
inequalities, we see that

$$\begin{aligned}
H(Z \mid R) + H(S \mid Z,T) &\ge H(Z \mid R,T) + H(S \mid Z,T) \\
&= H(Z \mid R,S,T) + I(S;Z \mid R,T) + H(S \mid Z,T) \\
&\ge H(Z \mid R,S,T) + I(S;Z \mid R,T) + H(S \mid R,Z,T) \\
&= H(Z \mid R,S,T) + H(S \mid R,T).
\end{aligned}$$

So $H(Z \mid R) - H(S \mid R,T) \ge H(Z \mid R,S,T) - H(S \mid Z,T)$; add $H(S \mid T)$ to both sides to get
the desired result. ■

Using this twice (and noting that $I(Z;Z \mid T) = H(Z \mid T)$ and $H(Z \mid Z,S,T) = 0$), we get

$$H(Z \mid R) + H(Z \mid S) + I(R;S \mid T) \ge H(Z \mid T) + H(Z \mid R,S,T). \tag{30}$$

The case where $T$ is a null variable gives

$$H(Z \mid R) + H(Z \mid S) + I(R;S) \ge H(Z) + H(Z \mid R,S). \tag{31}$$

These give us additional options in proving inequalities, as shown below.
Proof of inequality (8). Let $Z$ be a common information of $A$ and $B$; then

$$\begin{aligned}
& I(A;B \mid C) + I(A;B \mid D) + I(A;B \mid E) + I(C;D) + I(C,D;E) \\
&\quad \ge I(Z;Z \mid C) + I(Z;Z \mid D) + I(Z;Z \mid E) + I(C;D) + I(C,D;E) && \text{[from Corollary 2]} \\
&\quad = H(Z \mid C) + H(Z \mid D) + H(Z \mid E) + I(C;D) + I(C,D;E) \\
&\quad \ge H(Z) + H(Z \mid C,D) + H(Z \mid E) + I(C,D;E) && \text{[from (31)]} \\
&\quad \ge H(Z) + H(Z) + H(Z \mid C,D,E) && \text{[from (31)]} \\
&\quad \ge 2H(Z) \\
&\quad = 2I(A;B).
\end{aligned}$$

This proof immediately generalizes to give: If $A$ and $B$ have a common information, then

$$\begin{aligned}
(n-1)\,I(A;B) \le{} & I(A;B \mid C_1) + I(A;B \mid C_2) + \cdots + I(A;B \mid C_n) \\
& + \bigl[ I(C_1;C_2) + I(C_1 C_2;C_3) + \cdots + I(C_1 C_2 \cdots C_{n-1};C_n) \bigr].
\end{aligned} \tag{32}$$

The expression in brackets is actually symmetric in $C_1, C_2, \ldots, C_n$; it is equal to

$$H(C_1) + H(C_2) + \cdots + H(C_n) - H(C_1 C_2 \cdots C_n).$$
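
Both the telescoping identity for the bracketed expression and inequality (32) itself can be spot-checked on random subspaces. The sketch below is illustrative and rests on assumptions not in the paper: subspaces of $\mathrm{GF}(2)^6$ given by bitmask generators, $n$ between 2 and 4, and hypothetical helper names (`rank_gf2`, `H`, `I`, `bracket`, `slack_32`).

```python
import random


def rank_gf2(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # reduce v by b when b's leading bit is in v
        if v:
            basis.append(v)
    return len(basis)


def H(*spaces):
    """Rank of the sum of the given subspaces."""
    return rank_gf2([v for s in spaces for v in s])


def I(X, Y, Z=()):
    """I(X;Y|Z) computed from ranks; Z may be empty."""
    Z = list(Z)
    return H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)


def bracket(Cs):
    """I(C1;C2) + I(C1 C2;C3) + ... + I(C1 ... C_{n-1};C_n)."""
    total = 0
    for k in range(1, len(Cs)):
        prefix = [v for c in Cs[:k] for v in c]  # generators of C1+...+Ck
        total += I(prefix, Cs[k])
    return total


def slack_32(A, B, Cs):
    """Right-hand side minus left-hand side of inequality (32)."""
    n = len(Cs)
    return sum(I(A, B, c) for c in Cs) + bracket(Cs) - (n - 1) * I(A, B)


rng = random.Random(1)  # fixed seed for reproducibility
results = []
for _ in range(1000):
    n = rng.randint(2, 4)
    A, B, *Cs = [
        [rng.randrange(1, 64) for _ in range(rng.randint(1, 3))]
        for _ in range(n + 2)
    ]
    symmetric_form = sum(H(c) for c in Cs) - H(*Cs)
    results.append((slack_32(A, B, Cs), bracket(Cs) == symmetric_form))
```

For $n = 2$ the inequality tested is the Ingleton inequality, and for $n = 3$ it is inequality (8).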

One can use Lemma 5 to produce an extended form of Theorem 3 in which an additional option
is available: instead of having a left child, a node can have a left pointer pointing to some other
node anywhere in the tree or forest, and similarly on the right side.

Theorem 6. Suppose we have a finite binary forest where each node is labeled with an information
term $I(x;y \mid z)$, where $z$ is empty at each root node (i.e., the root labels are of the form $I(x;y)$).
These terms may involve any variables. We single out two variables or combinations of variables,
called $A$ and $B$. Suppose that, for each node of the forest, if its label is $I(x;y \mid z)$ [with $z$
possibly empty], then:

(a) $x$ is $A$ or $B$ and there is no left child, or

(b) there is a left child of this node and it is labeled $I(r;s \mid x)$ for some $r$, $s$, or

(c) there is a left pointer at this node pointing to some other node whose label is $I(r';s' \mid t')$
where $x = (r', s', t')$;

and

(a') $y$ is $A$ or $B$ and there is no right child, or

(b') there is a right child of this node and it is labeled $I(r';s' \mid y)$ for some $r'$, $s'$, or

(c') there is a right pointer at this node pointing to some other node whose label is $I(r';s' \mid t')$
where $y = (r', s', t')$.

Suppose further that no node is the destination of more than one pointer. Let $m$ be the number of
trees in the forest (equivalently, the number of root nodes). Then the inequality

$$m\,I(A;B) \le \text{sum of all the node labels in the trees} \tag{33}$$

is a linear rank inequality (in fact, it is true whenever $A$ and $B$ have a common information).
Proof. As with Theorem 3, let $Z$ be a new variable. For any left or right pointer, if $I(r;s \mid t)$
is the label at the destination of the pointer, we say that the term associated with the pointer is
$H(Z \mid r,s,t)$. We prove by induction in the forest (upward from the leaves toward the roots) that,
for each node $n$, if $T_n$ is the subtree rooted at $n$, and the node label at $n$ is $I(r;s \mid t)$, then
we have as a Shannon inequality

$$H(Z \mid t) \le \text{sum of node labels in } T_n + \mathrm{Out}_n - \mathrm{In}_n + j_n H(Z \mid A) + k_n H(Z \mid B) \tag{34}$$

for some $j_n, k_n \ge 0$, where $\mathrm{Out}_n$ is the sum of the terms associated with pointers from
nodes in $T_n$ and $\mathrm{In}_n$ is the sum of the terms associated with pointers to nodes in $T_n$.
(A pointer whose source and destination are both in $T_n$ will contribute to both sums, but these
contributions will cancel each other out.) The inductive step uses Lemma 5; the new term in that
lemma is used to handle the case where there is a pointer with destination $n$ (note that, by
assumption, there is at most one such pointer). Once (34) is proved, apply it to all of the root nodes
and add the resulting inequalities together to get

$$m\,H(Z) \le \text{sum of all the node labels in the trees} + j H(Z \mid A) + k H(Z \mid B) \tag{35}$$

for some $j, k \ge 0$; the pointer sums cancel out because each pointer contributes to one Out sum
and one In sum. Applying (35) when $Z$ is a common information of $A$ and $B$ gives the desired
result (33). ■

Theorem 6 can be used to prove inequalities (8) and (9) using the following diagrams (pointers
are represented as dashed curves):

(8):         I(C;D)                 I(C,D;E)
            /      \                      \
     I(A;B|C)    I(A;B|D)               I(A;B|E)

     [dashed left pointer from I(C,D;E) to I(C;D)]

(9):     I(A;C)                     I(D;E)
              \                    /      \
             I(B;D,E|C)     I(A;B|D)    I(A;B|E)

     [dashed right pointer from I(B;D,E|C) to I(D;E)]

And by using equivalent forms of terms as was done in formulas (3d) through (17d), one can
use Theorem 6 to prove formulas (6), (10), (19b), and (21b)-(24b) via the following diagrams:

[Forest diagrams for (6), (10), (19b), (21b), (22b), (23b), and (24b). The legible node labels of each
diagram are listed below; the tree edges and dashed pointers of the original figures are not
reproduced.]

(6): I(A;C), I(C,D;E|C), I(A;B|C,D), I(B;D,E|E), I(A;C,D,E|D,E)

(10): I(A;E), I(C;D), I(B;D,E|E), I(A;B|C), I(A;B|D), I(A;C,D|D,E)

(19b): I(B;D), I(D;E), I(A;B,C|B), I(A;C,D|D), I(A;C|D), I(A;B,E|E), I(B,C;D,E|C,D),
I(A;B,C|C), I(B,C;B,D|B,E)

(21b): I(B;D), I(C;E), I(A;B,C|B), I(A;C,D|D), I(A;B,C|C), I(A;D,E|E), I(B,C;C,E|C,D),
I(B,D;C,E|D,E), I(A;B,C|C,E)

(22b): I(B;E), I(C;D), I(A;B,C|B), I(A;C|E), I(B;E), I(A;B,C|C), I(A;B,D|D), I(A;B,C|C),
I(A;B,C|B), I(A;D,E|E), I(B,C;B,E|B,D), I(B,E;C,D|D,E)

(23b): I(B;E), I(D;E), I(A;B,C|B), I(A;C,E|E), I(A;B,D|D), I(A;C,E|E), I(B,C;C,D|C,E),
I(B,C;B,E|B,D), I(B,C;D,E|C,E), I(A;B,C|C,D)

(24b): I(A;E), I(B;D), I(A,B;C,D|B), I(C,D;D,E|D), I(A,B;B,D|B,E), I(A,B;C,D|C),
I(A,E;B,D|D,E), I(A,B;C,D|B,D)
One can also get a new extended version of Theorem 4 in the same way, though it is harder
to state precisely. It is also slightly less flexible because it disallows reuse of the same variable or
combination of variables; and the forest diagrams are easier to verify by inspection.

Here are two more explicit proofs. 

Proof of inequality (18). Let $Z$ be a common information of $A$ and $B$, and let $Y$ be a common
information of $A$ and $C$; note that we have $H(Y,Z \mid A) = 0$. Then

$$\begin{aligned}
& I(B;C) + I(A;B \mid D) + I(A;C \mid D) + I(B;D \mid E) + I(C;D \mid E) + I(A;E) \\
&\quad \ge I(Z;Y) + I(Y,Z;Z \mid D) + I(Y,Z;Y \mid D) + I(Z;D \mid E) + I(Y;D \mid E) + I(Y,Z;E) && \text{[from Corollary 2]} \\
&\quad = I(Z;Y) + H(Z \mid D) + H(Y \mid D) + I(Z;D \mid E) + I(Y;D \mid E) + I(Y,Z;E) \\
&\quad \ge I(Z;Y) + H(Y \mid D) + I(Z;Z \mid E) + I(Y;D \mid E) + I(Y,Z;E) && \text{[from Lemma 1]} \\
&\quad \ge I(Z;Y) + I(Z;Z \mid E) + I(Y;Y \mid E) + I(Y,Z;E) && \text{[from Lemma 1]} \\
&\quad = I(Z;Y) + H(Z \mid E) + H(Y \mid E) + I(Y,Z;E) \\
&\quad \ge I(Z;Y) + H(Y,Z \mid E) + I(Y,Z;E) \\
&\quad = I(Z;Y) + H(Y,Z) \\
&\quad = H(Z) + H(Y) \\
&\quad = I(A;B) + I(A;C).
\end{aligned}$$

Proof of inequality (20). Let $Z$ be a common information of $A$ and $B$, and let $Y$ be a common
information of $A$ and $C$; note that we have $H(Y,Z \mid A) = 0$ and
$H(C,Y \mid C) = H(C \mid C,Y) = 0$. Then

$$\begin{aligned}
& I(B;C) + I(B;D) + I(A;C \mid D) + I(A;B \mid E) + I(A;E \mid B) + I(C;D \mid E) + I(B;E \mid C,D) \\
&\quad \ge I(B;Y) + I(Z;D) + I(Y,Z;C,Y \mid D) + I(Z;Z \mid E) + I(Y;E \mid B) + I(Y;D \mid E) + I(Z;E \mid C,D) && \text{[from Corollary 2]} \\
&\quad = I(B;Y) + I(Z;D) + I(Y,Z;C,Y \mid D) + I(Z;Z \mid E) + I(Y;E \mid B) + I(Y;D \mid E) + I(Z;E \mid C,Y,D) \\
&\quad = I(B,E;Y) + I(Z;D) + I(Y,Z;C,Y \mid D) + I(Z;Z \mid E) + I(Y;D \mid E) + I(Z;E \mid C,Y,D) \\
&\quad \ge I(E;Y) + I(Z;D) + I(Y,Z;C,Y \mid D) + I(Z;Z \mid E) + I(Y;D \mid E) + I(Z;E \mid C,Y,D) \\
&\quad = I(D,E;Y) + I(Z;D) + I(Y,Z;C,Y \mid D) + I(Z;Z \mid E) + I(Z;E \mid C,Y,D) \\
&\quad = I(D,E;Y) + I(Z;D) + I(Z;C,Y \mid D) + I(Y;C,Y \mid D,Z) + I(Z;Z \mid E) + I(Z;E \mid C,Y,D) \\
&\quad = I(D,E;Y) + I(Z;D) + I(Z;C,Y \mid D) + H(Y \mid D,Z) + H(Z \mid E) + I(Z;E \mid C,Y,D) \\
&\quad = I(D,E;Y) + I(Z;D) + I(Z;C,E,Y \mid D) + H(Y \mid D,Z) + H(Z \mid E) \\
&\quad \ge I(D,E;Y) + I(Z;D) + I(Z;E,Y \mid D) + H(Y \mid D,Z) + H(Z \mid E) \\
&\quad = I(D,E;Y) + I(Z;D,E,Y) + H(Y \mid D,Z) + H(Z \mid E) \\
&\quad = I(D,E;Y) + I(Z;D,E) + I(Z;Y \mid D,E) + H(Y \mid D,Z) + H(Z \mid E) \\
&\quad \ge I(D,E;Y) + I(Z;D,E) + I(Z;Y \mid D,E) + H(Y \mid D,E,Z) + H(Z \mid D,E) \\
&\quad = I(D,E;Y) + I(Z;D,E) + H(Y \mid D,E) + H(Z \mid D,E) \\
&\quad = I(D,E;Y) + H(Z) + H(Y \mid D,E) \\
&\quad = H(Z) + H(Y) \\
&\quad = I(A;B) + I(A;C).
\end{aligned}$$



It is not yet clear how to generalize these. 



4 Completeness 

The complete (and verified nonredundant) list of linear-variable inequalities on five variables
consists of:

• the elemental Shannon inequalities:

$$0 \le I(A;B)$$
$$0 \le I(A;B \mid C)$$
$$0 \le I(A;B \mid C,D)$$
$$0 \le I(A;B \mid C,D,E)$$
$$0 \le H(A \mid B,C,D,E)$$

and the inequalities obtained from these by permuting the five variables $A, B, C, D, E$ (see
Yeung [15] for a proof that these imply all other 5-variable Shannon inequalities);

• the following instances of the Ingleton inequality:

$$I(A;B) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D) \tag{36}$$
$$I(A;B) \le I(A;B \mid C) + I(A;B \mid D,E) + I(C;D,E) \tag{37}$$
$$I(A;B,C) \le I(A;B,C \mid D) + I(A;B,C \mid E) + I(D;E) \tag{38}$$
$$I(A,B;A,C) \le I(A,B;A,C \mid A,D) + I(A,B;A,C \mid A,E) + I(A,D;A,E) \tag{39}$$

and the ones obtained from these by permuting the five variables A, B,C, D, E (see Guille, 
Chan, and Grant flU for a proof that these imply all other 5-variable instances of the Ingleton 
inequality); and 

• inequalities (1)-(24) and their permuted-variable forms.

To verify the completeness of this list, we consider the 31-dimensional real space whose
coordinates are labeled by the subsets of $\{A, B, C, D, E\}$ in the usual binary order:

$$\{A\}, \{B\}, \{A,B\}, \{C\}, \{A,C\}, \{B,C\}, \ldots, \{A,B,C,D,E\}.$$

Each of the listed inequalities, once it is rewritten in terms of the basic entropy terms

$$H(A), H(B), H(A,B), H(C), H(A,C), \ldots, H(A,B,C,D,E), \tag{40}$$

defines a half-space of this space; the intersection of these half-spaces is a polyhedral cone which 
can also be described as the convex hull of its extreme rays. If one of these extreme rays contains 
a nonzero point v which is (linearly) representable (i.e., there exist a vector space U and sub- 
spaces Ua, Ub, Uc, Ud, Ue of U such that dim([/A) = v{A), dim(?7ij) = v{B), dim(({7A, Ub)) = 
v{A, B), and so on), then this extreme ray can never be excluded by any as -yet- unknown linear 
rank inequality. If we verify that all of the extreme rays contain linearly representable points, then 
there can be no linear rank inequality which cuts down the polyhedral cone further, so the list of 
inequalities must be complete. 

There are 7943 extreme rays in $\mathbb{R}^{31}$ determined by the elemental Shannon inequalities and
inequalities (1)-(24) and (36)-(39) (and permutations). If one considers two such rays to be
essentially the same when one can be obtained from the other by a permutation of the five variables,
then there are 162 essentially different extreme rays. A full list of the vectors generating these rays
is available at:



http://zeger.us/linrank

Dougherty-Freiling-Zeger    July 21, 2010    Page 17 of 31

The authors have shown that each of these vectors is representable over the field of real num- 
bers; in fact, up to a scalar multiple, this representation can be done using matrices with integer 
entries which actually represent the vector over any field (finite or infinite). For instance, consider 
the extreme ray given by the vector 

1121223122323332333233323332333 

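(a list of 31 ranks or entropies in the order given by (40)). To this we associate the five matrices:

$$M_A = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$$

$$M_B = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$$

$$M_C = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$$

$$M_D = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}$$

$$M_E = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$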

The interpretation here is that we have a fixed field F, and the row space of each of these matrices
specifies a subspace of $F^3$. The specified vector gives H(A) = 1, and the row space of $M_A$ has
dimension 1; the vector gives H(B) = 1, and the row space of $M_B$ has dimension 1; the vector
gives H(A, B) = 2, and the vector sum of the row spaces of $M_A$ and $M_B$ (i.e., the row space of
$M_A$-on-top-of-$M_B$) has dimension 2; and so on. Equivalently, if we take three random variables
$x_1, x_2, x_3$ chosen uniformly and independently over the finite field F, and let $A = x_1$, $B = x_2$,
$C = x_3$, $D = x_1 + x_2 + x_3$, and $E = (x_1 + x_2, x_3)$, then the entropies of all combinations of
A, B, C, D, E (with logarithms to base |F|) are as specified by the above vector.

The dimensions of the row spaces listed above are easily computed over the real field (as ranks 
of the corresponding matrices). In order to verify that the same dimensions would be obtained 
over any field, one just has to note that, in each case where a matrix rank is computed to be k, 
there is actually a k x k submatrix whose determinant is ±1, so the selected k rows will still be 
independent even after being reduced modulo any prime. (Actually, it would suffice to verify that
the greatest common divisor of the determinants of all k x k submatrices is 1.)
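The rank computation described above can be replayed mechanically. The sketch below is only an illustration: it takes the five matrices to be the ones corresponding to $A = x_1$, $B = x_2$, $C = x_3$, $D = x_1 + x_2 + x_3$, $E = (x_1 + x_2, x_3)$, stacks the matrices for each nonempty subset in binary order, and computes the rank over the rationals or over GF(p). The names `rank`, `rank_vector`, and `MATS` are hypothetical, and the elimination routine is a generic one, not the authors' software.

```python
def rank(rows, p=None):
    """Rank of an integer matrix over the rationals (p=None) or over GF(p), p prime."""
    m = [[x if p is None else x % p for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                # Cross-multiply so that no division is needed in either field.
                a, b = m[r][col], m[i][col]
                m[i] = [a * y - b * x for x, y in zip(m[r], m[i])]
                if p is not None:
                    m[i] = [y % p for y in m[i]]
        r += 1
    return r

def rank_vector(mats, p=None):
    """Ranks of all nonempty unions of the given matrices, in binary order, as a digit string."""
    out = []
    for mask in range(1, 2 ** len(mats)):
        rows = [row for i, mat in enumerate(mats) if mask >> i & 1 for row in mat]
        out.append(rank(rows, p))
    return "".join(map(str, out))

# Matrices for A, B, C, D, E as read from the description in the text.
MATS = [
    [[1, 0, 0]],
    [[0, 1, 0]],
    [[0, 0, 1]],
    [[1, 1, 1]],
    [[1, 1, 0], [0, 0, 1]],
]
```

Calling `rank_vector(MATS)` and `rank_vector(MATS, p)` for a few small primes p gives a quick consistency check against the 31-digit vector quoted above.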

All of the other listed vectors turn out to be representable in the same way, except that for a 
few of them a scalar multiplier must be applied. For instance, consider the vector 

0111122112222221122222222222222.

To represent this, we would normally take $M_A$ to be a 0 x 2 matrix and $M_B, M_C, M_D, M_E$ to be
1 x 2 matrices whose unique rows have the property that any two are independent but any three are
dependent. (In other words, these row vectors are a linear representation for the uniform matroid
$U_{2,4}$.) For example, we could take

$$M_B = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad
M_C = \begin{bmatrix} 0 & 1 \end{bmatrix}, \quad
M_D = \begin{bmatrix} 1 & 1 \end{bmatrix}, \quad
M_E = \begin{bmatrix} 1 & 2 \end{bmatrix}$$


over the real field, but these would not work over the field of two elements. In fact, no such 
choice of row vectors works over the field of two elements (the first two row vectors would be 
independent, but then the only choice for the third vector would be the sum of the first two, and 
the same would hold for the fourth vector, contradicting the independence of the third and fourth 
vectors). But if we instead take the vector 

0222244224444442244444444444444,

which is twice the preceding vector and hence determines the same extreme ray, then we can get 
suitable representing matrices 



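$$M_A = [\;] \ (0 \times 4), \quad
M_B = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad
M_C = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

$$M_D = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}, \quad
M_E = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$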



which work over any field. The same doubling is needed for 13 more of the 162 vectors; and one
additional vector, the vector

1121222122222221222222222222222

corresponding to the uniform matroid $U_{2,5}$, had to be tripled in order to get a matrix representation
that works over all fields.
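The obstruction over the field of two elements described above can be confirmed by brute force. The following sketch (function names are illustrative) searches $GF(p)^2$ for the largest family of pairwise independent vectors; a representation of $U_{2,4}$ by single row vectors needs four of them.

```python
from itertools import combinations, product

def independent(u, v, p):
    """Two vectors in GF(p)^2 are independent iff their 2x2 determinant is nonzero mod p."""
    return (u[0] * v[1] - u[1] * v[0]) % p != 0

def max_pairwise_independent(p):
    """Largest number of pairwise independent vectors in GF(p)^2, by exhaustive search."""
    vectors = [v for v in product(range(p), repeat=2) if v != (0, 0)]
    best = 0
    for size in range(1, p + 3):
        found = any(
            all(independent(u, v, p) for u, v in combinations(family, 2))
            for family in combinations(vectors, size)
        )
        if not found:
            break
        best = size
    return best
```

For p = 2 the search stops at three vectors, which is why the undoubled vector has no representation over that field, while for p = 3 four vectors exist. The search is exponential and is meant only for very small p.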



5 Methodology; testing representability of polymatroids 

The list of five-variable linear rank inequalities was produced by the following iterative process.
Initially, we had the Shannon and Ingleton inequalities. At each stage, we took the current list
of inequalities and used Komei Fukuda's cddlib software [7] to get the corresponding list of
extreme rays. We then examined the vectors generating the extreme rays to see whether they were
representable (over the reals; we did not try to get representations working over all fields until after
the iterative process was complete). When such a vector provably could not be represented, the
proof (in each case we ran into here) yielded a new linear rank inequality provable via common
informations; when we examined a vector where we had difficulty determining whether it was
representable or not, we ran exhaustive tests on all ways of specifying a single common information
(toward the end, we had to try a pair of common informations) to see whether ITIP could verify
that the specified vector contradicted the Shannon inequalities together with the common
information specification. Again each such verification led to a new linear rank inequality. (Of course, this
is a highly sanitized version of the process as it actually occurred.)

The testing of extreme rays for linear representability soon became a large task, so we gradually 
developed software to automatically find such representations in a number of cases (and we added 
more cases when we found new ways to represent vectors). This software used combinatorial 
rather than linear-algebra methods; for instance, the output of the program for the sample vector

1121223122323332333233323332333 (41)

used above was a specification of five vector spaces A, B, C, D, E which could be paraphrased as:
"A is generated by one vector, B is generated by one vector not in A, C is generated by one vector
not in A + B [the space spanned by A and B], D is generated by one vector in general position in
A + B + C, and E is generated by two vectors, one in $(A + B) \cap (C + D)$ and one in C." The
development of the software involved recognizing as many cases as possible where one could find
such a specification which could be met over the reals (or over any sufficiently large finite field)
and would yield the desired rank vector.

The (attempted) construction of a representation is done one basic subspace at a time: first the 
representation of A is constructed (this step is trivial), then the representation of B given A, then 
the representation of C given A and B, and so on. And each of these subspace representations is 
constructed one basis vector at a time. Given the representation of A, B, C, and D, the algorithm 
will determine how many basis vectors are needed for subspace E and successively try to choose 
them in suitable positions relative to the existing subspaces. At each step, a new vector will be 
chosen in general position in a subspace which is a sum of some of the already-handled subspaces 
A, B, C, D. (Here "general position" means in the selected subspace but not in any relevant proper 
subspace of it. Which subspaces are relevant depends on the current situation; we avoid having to 
determine this explicitly by just saying that the underlying field is sufficiently large, or infinite.) If 
there is a problem with specifying that the vector is in such a sum of basic subspaces, then we may 
have to specify that the vector is in the intersection of two sums of basic subspaces. 

Once the first vector is chosen, we take quotients of all of the existing spaces by this vector to 
get the new situation in which the second vector needs to be chosen. This is all done by counting 
dimensions, not by constructing actual numerical vectors. For instance, suppose the first vector is 
chosen to be in general position in subspace R which is a sum of basic subspaces from A, B, C, D 
(e.g., R = A + B). For each other sum subspace T, if the new vector is in T, then the quotient
by the chosen vector will reduce the dimension of T by 1; if the chosen vector is not in T, then
the quotient will not change the dimension of T. Since the vector is in general position in R, the
vector will be in T if and only if $R \subseteq T$, and to check whether $R \subseteq T$ one simply has to see
whether $\dim(R + T) = \dim T$. The case where the vector is chosen from an intersection of two
sum subspaces R and S is more complicated; more on this below.

Consider the example (41). Suppose that we have already constructed the representations for
subspaces A, B, C, and D, and we are now ready to construct the representation for subspace E.
The current situation can be summarized by the following two-row array:

0112122312232333
2221111011100000 (42)

Here the first row is the ranks of sums from A, B, C, D in the order given by (40), but starting
with the empty space. For each of these sums, the second row gives the amount by which adding
the new subspace E will increase the dimension of the sum. (So the second entry in this row is
H(A + E) - H(A) = 3 - 1 = 2, the fourth entry is H(A + B + E) - H(A + B) = 3 - 2 = 1,
and so on.)

From this array, we can see that, since E has dimension 2 but only increases the dimension of
A + B by 1, one of the nonzero vectors in E must be in A + B. So let us start by assuming that
one of the vectors in E is a vector chosen in general position in R = A + B. We can now check
for all sums from A, B, C, D whether the sum will contain this chosen vector; this information is
summarized in the row

0001000100010111 (43) 

where 1 means the chosen vector is in the corresponding sum. To get the result of taking a quotient
by (the subspace generated by) the chosen vector, we subtract (43) from the first row of (42)
(because we have used up one vector from each of the indicated subspaces) and subtract the one's
complement of (43) from the second row of (42) (because we have taken care of one of the new
vectors for E beyond each of the indicated subspaces). So the situation after the first vector is
chosen is given by:

0 1 1 1 1 2 2 2 1 2 2 2  2 2 2 2
1 1 1 1 0 0 0 0 0 0 0 0 -1 0 0 0

Of course, the negative entry in this array means that a problem has occurred: we tried to take
a new vector not in C + D, but the given ranks require all vectors in E to be in C + D. So we will
try again; instead of taking a vector in general position in R = A + B, we take a vector in general
position in $R \cap S$, where S = C + D.
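The bookkeeping for this first attempt is simple enough to replay in a few lines. The sketch below is a reading of the step just described, not the authors' program: `membership_for_sum` applies the test dim(R + T) = dim T to every sum subspace T (indexed by bitmask, A = 1, B = 2, C = 4, D = 8), and `quotient` performs the two subtractions. All names are illustrative.

```python
RANKS = [int(d) for d in "0112122312232333"]  # first row of array (42)
GAINS = [int(d) for d in "2221111011100000"]  # second row of array (42)

def membership_for_sum(ranks, R):
    """Row with 1 where the sum subspace T contains R, i.e. dim(R + T) == dim(T)."""
    return [int(ranks[R | T] == ranks[T]) for T in range(len(ranks))]

def quotient(ranks, gains, member):
    """Two-row array after taking the quotient by a new vector with this membership row."""
    new_ranks = [r - m for r, m in zip(ranks, member)]
    new_gains = [g - (1 - m) for g, m in zip(gains, member)]  # subtract the complement
    return new_ranks, new_gains

member = membership_for_sum(RANKS, 0b0011)  # R = A + B
new_ranks, new_gains = quotient(RANKS, GAINS, member)
```

Here `member` reproduces row (43), and `new_gains` has its single negative entry at index 12, the position of C + D, which is the contradiction noted in the text.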

This leaves the problem of determining, for each sum subspace T, whether the chosen vector is
in T; as before, this is equivalent to determining whether $R \cap S \subseteq T$. This is not as straightforward
as it was to determine whether $R \subseteq T$; in fact, there are situations where the given data on ranks
of sum subspaces simply do not determine whether $R \cap S \subseteq T$. But we have identified many
situations where the given data do allow this determination to be made. Here is a list; note that
(a) each such test can also be applied with R and S interchanged, and (b) reading this list is not
necessary for understanding the rest of the algorithm.

• If $R \subseteq T$, then $R \cap S \subseteq T$.

• If the dimensions of $R \cap S$, $R \cap T$, and $R \cap (S + T)$ are all equal, then $R \cap S \subseteq T$. [If
two subspaces have the same (finite) dimension and one is included in the other, then the
two subspaces are equal. Hence, we get $R \cap S = R \cap (S + T) = R \cap T$, so
$R \cap S = (R \cap S) \cap (R \cap T) = R \cap S \cap T$, so $R \cap S \subseteq T$. Also, recall that the dimension of $R \cap S$
can be determined from the given data; it is equal to $I(R;S) = H(R) + H(S) - H(R,S)$.]

• If the dimensions of $R \cap T$, $S \cap T$, $(R + S) \cap T$, and $R \cap S$ are all equal, then $R \cap S \subseteq T$.
[We have $R \cap T = (R + S) \cap T = S \cap T$, so $R \cap T = R \cap S \cap T$. But now
$\dim(R \cap S) = \dim(R \cap T) = \dim(R \cap S \cap T)$, so $R \cap S = R \cap S \cap T$, so $R \cap S \subseteq T$.]

• If $\dim(R \cap T) < \dim(R \cap S)$, then $R \cap S \not\subseteq R \cap T$, so we must have $R \cap S \not\subseteq T$.

• Let $R \cap^* S$ be the "nominal intersection" of R and S (i.e., the sum of the basic subspaces
listed both in the sum R and the sum S). Clearly $R \cap^* S \subseteq R \cap S$, so, if $R \cap^* S \not\subseteq T$, then
$R \cap S \not\subseteq T$.

• If $\dim(R \cap T) < \dim(R \cap ((R \cap^* T) + S))$, then $R \cap S \not\subseteq T$. [First note that, if U, V, W
are subspaces such that $V \subseteq U$, then $U \cap (V + W) = V + (U \cap W)$. (The right-to-left
inclusion is easy. For the left-to-right inclusion, if $u = v + w$ where $u \in U$, $v \in V$, and
$w \in W$, then $u - v = w \in U \cap W$, so $v + w \in V + (U \cap W)$.) Hence, if $R \cap S \subseteq T$, then
$R \cap ((R \cap^* T) + S) = (R \cap^* T) + (R \cap S) \subseteq R \cap T$, so $\dim(R \cap ((R \cap^* T) + S)) \le \dim(R \cap T)$.]

• If $T' \subseteq T$ and $R \cap S \subseteq T'$, then $R \cap S \subseteq T$. If $T \subseteq T'$ and $R \cap S \not\subseteq T'$, then $R \cap S \not\subseteq T$.

• Let $R \setminus^* S$ be the "nominal difference" of R and S (i.e., the sum of the basic subspaces listed
in the sum R but not in the sum S), and let $U = (R \setminus^* S) + (S \setminus^* R)$. If $\dim(U \cap (R \cap^* S)) = 0$,
then

$$R \cap S = ((R \setminus^* S) \cap (S \setminus^* R)) + (R \cap^* S).$$

[The right-to-left inclusion is easy. For the left-to-right inclusion, note that $R = (R \setminus^* S) +
(R \cap^* S)$ and $S = (S \setminus^* R) + (R \cap^* S)$. Hence, if $x \in R \cap S$, then we have $x = y_1 + z_1 =
y_2 + z_2$ for some $y_1 \in R \setminus^* S$, $y_2 \in S \setminus^* R$, and $z_1, z_2 \in R \cap^* S$. Then $y_2 - y_1 = z_1 - z_2$
is in $U \cap (R \cap^* S)$, so we have $y_2 = y_1$ and $z_2 = z_1$; hence, $y_1 \in (R \setminus^* S) \cap (S \setminus^* R)$ and
$x = y_1 + z_1$ is in the desired form.] Hence, if $\dim(U \cap (R \cap^* S)) = 0$, $R \cap^* S \subseteq T$, and
$((R \setminus^* S) \cap (S \setminus^* R)) \subseteq T$, then $R \cap S \subseteq T$.
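To make a few of these tests concrete, here is a minimal sketch that implements only three of them (the inclusion test, the comparison of dim(R ∩ T) with dim(R ∩ S), and the test involving the nominal intersection), each also with R and S interchanged. Sum subspaces are encoded as bitmasks (A = 1, B = 2, C = 4, D = 8), so the nominal intersection is a bitwise AND and a sum is a bitwise OR. The names `inter_dim`, `contains`, and `in_T` are hypothetical, and the 16 ranks are the first row of array (42).

```python
RANKS = [int(d) for d in "0112122312232333"]  # ranks of sums of A, B, C, D, from the empty sum

def inter_dim(r, X, Y):
    """dim(X ∩ Y) = H(X) + H(Y) - H(X, Y) for sum subspaces X and Y."""
    return r[X] + r[Y] - r[X | Y]

def contains(r, X, T):
    """True iff X ⊆ T, tested as dim(X + T) == dim(T)."""
    return r[X | T] == r[T]

def in_T(r, R, S, T):
    """True / False if the tests decide whether R ∩ S ⊆ T, None if they do not."""
    for P, Q in ((R, S), (S, R)):
        if contains(r, P, T):
            return True   # P ⊆ T implies P ∩ Q ⊆ T
        if inter_dim(r, P, T) < inter_dim(r, P, Q):
            return False  # P ∩ Q is too large to fit inside P ∩ T
        if inter_dim(r, P, T) < inter_dim(r, P, (P & T) | Q):
            return False  # test using the nominal intersection of P and T
    return None

R, S = 0b0011, 0b1100  # R = A + B, S = C + D
membership = [in_T(RANKS, R, S, T) for T in range(16)]
```

For this example none of the 16 entries comes out as `None`, so these three tests alone decide every case; in general the remaining tests in the list (or a manual argument) may be needed.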

These tests do suffice for the example here; the resulting membership vector is 

0001000100011111 

and the new array after taking a quotient by the first chosen vector is: 

0111122212221222 
1111000000000000 
Let us call the new quotient spaces A', B', C', D', E'. The new ranks indicate that the remaining
vector in E' must be chosen to be in C'. If we take the new vector in general position in C', then
the resulting membership vector is:

0000111111111111

(Note that we needed the chosen vector to be in D' as well as in C', but this turned out to be
automatic, because the given ranks implied C' = C' + D' = D'.) And the result of taking a
quotient by the second chosen vector is:

0111011101110111 
0000000000000000 

The all-0 row means that the representation of E has been successfully completed. 

The current algorithm does not try many possibilities for the next vector to choose; it simply 
chooses one sum subspace (usually at the beginning of the list of available ones) to try to add a 
vector to, and, if that yields an immediate contradiction, perhaps tries one intersection of two sum 
subspaces. If any such step fails (either because of a contradiction, or because the algorithm cannot
determine whether $R \cap S \subseteq T$ in some case), the algorithm gives up. However, the algorithm does
give itself up to 120 chances by trying all permutations of the 5 basic variables.

Each time a new extreme ray was produced, the above algorithm was applied as a positive test 
for representability, while tests against common informations were used as negative tests. If both 
sides failed, the ray was examined by hand. Sometimes this examination yielded a representation
because we found a new way of determining whether $R \cap S \subseteq T$; if so, this new test was added to
the algorithm. At the end, the algorithm was able to verify representability of 152 of the final 162
extreme rays, leaving only 10 to be done by hand (by methods which did not fit in the framework 
of this algorithm). 

There are other possibilities for improving the algorithm that we have not yet implemented. 
One is doing a backtrack search to consider more possibilities for choosing vectors to add; another 
is to use the information on representation of previous subspaces in the construction of the repre- 
sentation of the current subspace. (In the preceding example, we used only the dimension data for 
A, B,C, D in the construction of the representation for E; we did not use the actual representa- 
tions constructed for A, B, C, D.) More ambitious would be to allow more options for choosing 
new vectors in terms of the known relations between the current subspaces. 

6 Six-variable inequalities (ongoing work)

This iterative process for finding all linear rank inequalities is likely to be infeasible to complete for 
six or more variables. (Each cddlib polytope computation in 31 dimensions took about 2-3 days;
in 63 dimensions it would take far longer, as well as rapidly exceeding the memory available.) But 
we plan to continue the study, because we expect to find new phenomena at higher levels, possibly 
including extreme rays that are representable over some fields but not over others (hence yielding 
rank inequalities which hold only over those other fields), and inequalities which hold for ranks of 
vector spaces but are not provable via common informations. For instance, such situations could 
come from the variables associated with the Fano and non-Fano networks in [4], or the network
in a. 

In order to make any progress at all, we had to take some shortcuts (since, as noted above, 
63-dimensional polytope computations were out of the question). One of these was to reduce the 
dimension of the search by assuming equality for one or more of the inequalities found so far; in 
effect, this is just concentrating on one face, corner, or intermediate-dimensional extreme part of 
the current region. Another was to work hard on trying to improve already-obtained inequalities, 
find additional instances of them, or strengthen them in multiple ways if they were not already 
faces of the region. 

We will show here some of the 6-variable inequalities we have found so far; a much longer list 
is available at: 

http://zeger.us/linrank

All of these have been verified to be faces of the linear rank region (so they cannot be improved). To 
do this, we used a stockpile of linearly representable 6-variable polymatroids (the representability 
was proved by the algorithm described in the preceding section) encountered during the polytope 
computations. If a 6-variable linear rank inequality is satisfied with equality by 62 linearly in- 
dependent vectors from the stockpile, then it must give a face of the linear rank region. (The 
stockpile currently contains 3220 polymatroids, or 1846734 after one takes all instances obtained 
by permuting the six basic variables. It is also available at the above website.) 
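The face test just described amounts to a rank computation on the stockpile vectors that satisfy the inequality with equality. The sketch below illustrates it on a toy case with two variables, where the space has coordinates (H(A), H(B), H(A,B)) and dimension 3, so 2 independent tight vectors are needed (in the six-variable setting the corresponding numbers are 63 and 62). The function names and the toy stockpile are assumptions made for illustration.

```python
def rank(rows):
    """Rank over the rationals of an integer matrix, by fraction-free elimination."""
    m = [list(row) for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            a, b = m[r][col], m[i][col]
            m[i] = [a * y - b * x for x, y in zip(m[r], m[i])]
        r += 1
    return r

def is_face(coef, stockpile):
    """True if the vectors meeting coef . v >= 0 with equality span a hyperplane."""
    tight = [v for v in stockpile if sum(c * x for c, x in zip(coef, v)) == 0]
    return rank(tight) >= len(coef) - 1

# Toy stockpile of representable rank vectors (H(A), H(B), H(A,B)).
STOCKPILE = [(1, 0, 1), (0, 1, 1), (1, 1, 2), (1, 1, 1)]
MUTUAL_INFO = (1, 1, -1)   # I(A;B) >= 0
ENTROPY_A = (1, 0, 0)      # H(A) >= 0, implied by other inequalities
```

With this stockpile, `is_face(MUTUAL_INFO, STOCKPILE)` holds while `is_face(ENTROPY_A, STOCKPILE)` does not. As in the text, a negative answer may only mean that the stockpile is too small.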

First, there are the 6-variable elemental Shannon inequalities; there are 6 of these if one lists 
just one of each form, but 246 of them if all of the permuted-variable versions are counted. Then 
there are 12 instances of the Ingleton inequality (1470 counting permuted forms). Again, see
Yeung [15] and Guille, Chan, and Grant [8] for the proof that these inequalities imply all of the
other Shannon and Ingleton inequalities.
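The two counts of elemental Shannon inequalities quoted above (6 forms, 246 inequalities) can be reproduced by direct enumeration. The sketch below uses the standard description of the elemental inequalities, H(X_i | all other variables) ≥ 0 and I(X_i; X_j | X_K) ≥ 0 with K a subset of the remaining variables; the function names are illustrative.

```python
from itertools import combinations

def elemental_shannon(n):
    """Elemental inequalities on variables 0..n-1.
    ("H", i) stands for H(X_i | all others) >= 0;
    ("I", i, j, K) stands for I(X_i; X_j | X_K) >= 0."""
    ineqs = [("H", i) for i in range(n)]
    for i, j in combinations(range(n), 2):
        rest = [k for k in range(n) if k not in (i, j)]
        for size in range(len(rest) + 1):
            for K in combinations(rest, size):
                ineqs.append(("I", i, j, K))
    return ineqs

def count_forms(ineqs):
    """Number of forms up to permuting variables: one H form, one I form per size of K."""
    return len({("H",) if q[0] == "H" else ("I", len(q[3])) for q in ineqs})
```

For n = 6 this gives 6 + 15 · 16 = 246 inequalities in 6 forms.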

Next come the instances of the 5-variable inequalities (1)-(24). The initial computation found
183 of these instances that (with permuted forms) proved all of the others. However, 16 of these 
instances did not pass the face verification above and were later superseded by other 6-variable 
inequalities; this left 167 (61740 counting permuted forms) 5-variable instances which were faces 
of the 6-variable rank region. 

Finally, there are the true 6-variable inequalities. We have found 3490 of these so far (2395095 
counting permuted forms) which pass the face verification, along with several hundred more which 
do not pass and which we expect to be superseded later (though this is not guaranteed; perhaps our 
stockpile of representable polymatroids is insufficient, although the face test has been very reliable 
so far). We give some examples of these here; see the website mentioned above for the full list. 

Some inequalities follow directly from Theorem 3, such as:

$$I(A;B) \le I(A;C) + I(B;D \mid C) + I(A;E \mid D) + I(B;F \mid E) + I(A;B \mid F) \tag{44}$$

$$I(A;B) \le I(A;C) + I(B;D \mid C) + I(A;E \mid D) + I(A;F \mid E) + I(A;B \mid F) \tag{45}$$

$$I(A;B) \le I(A;C) + I(B;D \mid C) + I(E;F \mid D) + I(A;B \mid E) + I(A;B \mid F) \tag{46}$$

$$I(A;B) \le I(A;C) + I(D;E \mid C) + I(A;B \mid D) + I(B;F \mid E) + I(A;B \mid F) \tag{47}$$

$$I(A;B) \le I(C;D) + I(A;B \mid C) + I(E;F \mid D) + I(A;B \mid E) + I(A;B \mid F) \tag{48}$$
And others follow directly from Theorem 6, such as:

$$\begin{aligned}
2I(A;B) \le{}& I(A;C) + I(D;E,F \mid C) + I(A;B \mid D) \\
&+ I(E;F) + I(A;B \mid E) + I(A;B \mid F)
\end{aligned} \tag{49}$$

$$\begin{aligned}
2I(A;B) \le{}& I(A;C) + I(B;D \mid C) + I(A;E,F \mid D) \\
&+ I(E;F) + I(A;B \mid E) + I(A;B \mid F)
\end{aligned} \tag{50}$$

$$\begin{aligned}
2I(A;B) \le{}& I(C;D) + I(A;B \mid C) + I(B;E,F \mid D) \\
&+ I(E;F) + I(A;B \mid E) + I(A;B \mid F)
\end{aligned} \tag{51}$$

$$\begin{aligned}
2I(A;B) \le{}& I(C,D;E) + I(C;D) + I(A;F \mid C) \\
&+ I(A;B \mid F) + I(A;B \mid D) + I(A;B \mid E)
\end{aligned} \tag{52}$$

$$\begin{aligned}
3I(A;B) \le{}& I(C,D;E,F) + I(C;D) + I(E;F) + I(A;B \mid C) \\
&+ I(A;B \mid D) + I(A;B \mid E) + I(A;B \mid F)
\end{aligned} \tag{53}$$

Then there are inequalities which follow from Theorem 3 or Theorem 6 using equivalent forms:

$$\begin{aligned}
I(A;B,C) \le{}& I(D;E) + I(C;F \mid D) + I(A;B \mid D,F) \\
&+ I(A;B \mid C,D) + I(A;C \mid B,F) + I(A;B,C \mid E)
\end{aligned} \tag{54}$$

$$\begin{aligned}
I(A,B;C,D) \le{}& I(A;C,D) + I(B;E \mid A) + I(B;D \mid A,C,F) + I(D;F \mid A,E) \\
&+ I(B;C \mid A,E,F) + I(B;C \mid D,E) + I(A;D \mid B,C,F) \\
&+ I(A;C \mid B,E,F) + I(A;F \mid B,D,E)
\end{aligned} \tag{55}$$

$$\begin{aligned}
2I(A;B) \le{}& I(D;F) + I(A;C) + I(B;D \mid C) + I(A;B \mid F) + I(A;E \mid D) \\
&+ I(A;F \mid C,D) + I(A;B \mid E)
\end{aligned} \tag{56}$$

$$\begin{aligned}
I(A;B,C) \le{}& I(A;C) + I(B;D \mid C) + I(A;F \mid D) + I(A;B \mid F) + I(C;E \mid B,F) \\
&+ I(A;C \mid B,E)
\end{aligned} \tag{57}$$
$$\begin{aligned}
3I(A,B;C,D,E) \le{}& I(A;C,F) + I(A,B;D) + I(A,B;E) + I(C;\,\text{[illegible]}) + I(D;F \mid E) \\
&+ I(A;E \mid D,F) + I(B;C \mid A,D,F) + I(B;D \mid C,F) + I(A;D,E \mid B,C) \\
&+ I(A;D \mid B,C,E) + I(A;C \mid E,F) + I(B;D \mid A,E,F) + I(B;C,D \mid A) \\
&+ I(A,B;E \mid C,D) + I(B;E \mid A,C,D) + I(B;D \mid C,E,F) \\
&+ I(A,B;C \mid D,E)
\end{aligned} \tag{58}$$

All of the sharp inequalities found so far using one common information have been verified to
be instances of Theorem 6. It seems quite possible that this theorem generates all
one-common-information inequalities, but we have no proof of this.

There are also hundreds of inequalities that required two common informations to prove. (In- 
equalities requiring more than two common informations are beyond the range of our software at 
present.) These are of two types. One type is those like inequalities (18) and (20) which have two 
information terms on the left side and use the common informations corresponding to those terms:

$$\begin{aligned}
I(A;B) + I(A;C) \le{}& I(B;C) + I(A;D) + I(B;E \mid D) + I(C;F \mid D) \\
&+ I(A;B \mid E) + I(A;C \mid F)
\end{aligned} \tag{59}$$

$$\begin{aligned}
2I(A;B,C) + I(B;C,D) \le{}& I(A;C,E) + I(A;F) + I(A;C \mid D) + 2I(A;B \mid C,F) \\
&+ I(B;C) + I(E;F \mid C) + 2I(B;D \mid C,E) + I(C;E \mid F) \\
&+ I(A;D \mid E,F) + I(D;E \mid A,C,F) + 2I(A;F \mid C,D,E)
\end{aligned} \tag{60}$$

The other type has just one information term on the left side but requires a second common infor- 
mation in addition to the one from the left term: 

$$\begin{aligned}
I(A;B) \le{}& I(A;C) + I(B;D \mid C) + I(E;F \mid D) + I(A;B \mid E) + I(A;C \mid F) \\
&+ I(B;E \mid C,F)
\end{aligned} \tag{61}$$

$$\begin{aligned}
2I(A,B;C,D,E) \le{}& I(A,B;D,E) + I(A,D,F;C) + I(A,F;D \mid C) + I(B;C \mid D,E) \\
&+ I(A;C \mid B) + I(A;D \mid B,C,E) + 2I(A;C \mid D,E,F) + I(B;C \mid A,D,E) \\
&+ I(A;E \mid B,D,F) + I(B;E \mid A,C,F) + I(B;E \mid A,D,F) + I(B;E \mid C,D) \\
&+ I(B;D \mid A,E,F) + I(A;F \mid B,D,E) + I(A;F \mid B,C,D)
\end{aligned} \tag{62}$$

$$\begin{aligned}
2I(A;B,C) \le{}& I(A;B) + I(D;E) + I(A;B \mid C) + I(C;E \mid B) + I(D;F \mid B,E) \\
&+ I(C;F \mid D) + I(A;B \mid C,D) + I(A;B,C \mid F) + I(A;C \mid E)
\end{aligned} \tag{63}$$

Inequality (61) is proved using a common information for A and B along with a common
information for E and (D, F); inequality (62) is proved using a common information for (A, B)
and (C, D) along with a common information for (B, F) and (A, D, E); and inequality (63) is
proved using a common information Z for A and (B, C) along with a common information for F
and Z. (The possible need for such iteration of common informations along with joining of
variables makes it conceivable that an unbounded number of common informations could be needed
to prove linear rank inequalities even on a fixed number of initial variables such as 6.)

Since the inequalities in this paper have been proven using only common informations and the 
Shannon inequalities, they apply not only to linear ranks but also in any other situation where we 
have random variables which are known to have common informations. For instance, Chan notes
in [1, Definition 4] that abelian group characterizable random variables always have common
informations (which are still abelian group characterizable random variables); hence, the inequalities
proven here hold for such variables.

7 An infinite list of linear rank inequalities 

The following theorem shows that there will be essentially new inequalities for each number of 
variables: 

Theorem 7. For any $n \ge 2$, the inequality

$$(n - 1) I(A;B) + H(C_1 C_2 \cdots C_n) \le \sum_{i=1}^{n} I(A, C_i; B, C_i) \tag{64}$$

is a linear rank inequality on n + 2 variables which is not a consequence of instances of linear
rank inequalities on fewer than n + 2 variables.

Proof. First, it is not hard to show that (64) is equivalent to (32), and we have already seen that
(32) is a linear rank inequality (this can also be proved using Theorem 6), so (64) is a linear rank
inequality.

In the following, if $S = \{i_1, i_2, \ldots, i_k\} \subseteq \{1, 2, \ldots, n\}$, we will write $C_S$ for $C_{i_1} C_{i_2} \cdots C_{i_k}$.
Define a rank vector v on the subsets of $\{A, B, C_1, C_2, \ldots, C_n\}$ as follows: for any
$S \subseteq \{1, 2, \ldots, n\}$,

$$\begin{aligned}
v(C_S) &= 2|S|, \\
v(A C_S) &= n + |S|, \\
v(B C_S) &= \min(2n - 2 + |S|,\ 2n), \\
v(A B C_S) &= \min(2n - 1 + |S|,\ 2n).
\end{aligned}$$

One can easily check that v does not satisfy (64). We will show that v does satisfy all instances
(using the variables $A, B, C_1, C_2, \ldots, C_n$) of all linear rank inequalities on fewer than n + 2 variables;
this will imply that (64) is not a consequence of these instances, as desired.

For this purpose, we construct rank vectors $w_A, w_B, w_1, w_2, \ldots, w_n$, each of which is the same
as v except for one value. The changed values are:

$$\begin{aligned}
w_A(A) &= n - 1, \\
w_B(B) &= 2n - 3, \\
w_i(B C_i) &= 2n.
\end{aligned}$$
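The claims about v and the modified vectors can be checked numerically for small n. The sketch below rests on two assumptions about the displayed formulas, namely that (64) reads $(n-1) I(A;B) + H(C_1 \cdots C_n) \le \sum_i I(A, C_i; B, C_i)$ and that $v(A B C_S) = \min(2n - 1 + |S|, 2n)$; the function names are illustrative. A rank function is represented as `h(a, b, S)`, where `a` and `b` indicate whether A and B are present and `S` is a tuple of indices.

```python
def v(n, a, b, S):
    """Rank vector v from the proof (under the assumptions stated above)."""
    s = len(S)
    if a and b:
        return min(2 * n - 1 + s, 2 * n)
    if b:
        return min(2 * n - 2 + s, 2 * n)
    if a:
        return n + s
    return 2 * s

def modified(n, which):
    """w_A, w_B, or w_i (which = "A", "B", or an index i): v with one value changed."""
    def w(a, b, S):
        if which == "A" and (a, b, S) == (1, 0, ()):
            return n - 1
        if which == "B" and (a, b, S) == (0, 1, ()):
            return 2 * n - 3
        if (a, b) == (0, 1) and S == (which,):
            return 2 * n
        return v(n, a, b, S)
    return w

def gap(n, h):
    """Left side minus right side of (64); a positive value means (64) fails for h."""
    mutual_ab = h(1, 0, ()) + h(0, 1, ()) - h(1, 1, ())
    lhs = (n - 1) * mutual_ab + h(0, 0, tuple(range(1, n + 1)))
    rhs = sum(h(1, 0, (i,)) + h(0, 1, (i,)) - h(1, 1, (i,)) for i in range(1, n + 1))
    return lhs - rhs
```

Under these assumptions, `gap(n, lambda a, b, S: v(n, a, b, S))` equals 1 for every n ≥ 2, so v violates (64), while the gap is 2 − n for `modified(n, "A")` and `modified(n, "B")` and 0 for each `modified(n, i)`, so none of the w vectors violates it.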

We will show that each of these w vectors is linearly representable over any infinite or
sufficiently large finite field F. In each case, the representation will use a vector space V over F of
dimension 2n, with a basis $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n$, and the variable $C_j$ ($1 \le j \le n$) will be
represented by the two-dimensional subspace $\langle x_j, y_j \rangle$.

For the representations of A and B, instead of giving explicit formulas, it will be convenient
to use the following concept. Suppose U is a nontrivial subspace of V. A point $u \in U$ is said
to be in general position in U, relative to a given finite set S of points (if S is not specified, then
we let S be the set of all points that have previously been mentioned explicitly), if u does not lie
in any subspace U' of V spanned by a subset of S unless U' includes all of U. If the set S is of
size bounded by N, then the "in general position" condition excludes at most $2^N$ proper subspaces
of U (including the trivial subspace), so there is no problem finding points in general position as
long as the field size is greater than $2^N$. If we refer to multiple points being chosen in general
position, then they should be considered as chosen successively, with later points being in general
position relative to earlier points as well as the previous set S. This concept has been referred
to by various terms; for instance, in [13] such points are referred to as "freely placed". Points
chosen in this way make it easy to compute augmented subspace dimensions: if u is in general
position in U relative to S and U' is a subspace spanned by points in S, then $\dim(\langle U', u \rangle)$ is equal
to $\dim(U') + 1$ unless $U \subseteq U'$, in which case it is equal to $\dim(U')$.
For each $i \le n$, a representation of $w_i$ is obtained by assigning to A the space

$$X = \langle x_1, x_2, \ldots, x_n \rangle$$

and assigning to B the space spanned by all of the x vectors except $x_i$, together with n - 1 additional
points chosen in general position in V.

For the representation of $w_B$, we again assign to A the space X; B is assigned a space spanned
by n - 2 points in general position in X together with n - 1 additional points in general position
in V.

To represent $w_A$, choose points $z_1, z_2, \ldots, z_{n-1}$ in general position in X, and assign to A and
B the spaces $\langle z_1, z_2, \ldots, z_{n-1} \rangle$ and $\langle z_1, z_2, \ldots, z_{n-2}, y_1, y_2, \ldots, y_n \rangle$, respectively.

It remains to show that, if $C(t_1, \ldots, t_k) \ge 0$ is a linear rank inequality on k variables with
k < n + 2, then no instance of this inequality fails for v. An instance of this inequality which
applies to v is given by a map f from $\{t_1, \ldots, t_k\}$ to the subsets of $\{A, B, C_1, \ldots, C_n\}$. (Then
the definition of f can be immediately extended to the subsets of $\{t_1, \ldots, t_k\}$ by the formula
$f(\{t_{j_1}, \ldots, t_{j_m}\}) = f(t_{j_1}) \cup \cdots \cup f(t_{j_m})$.) So suppose we have an instance, given by C and f as
above, which fails for v. Since $C(t_1, \ldots, t_k) \ge 0$ is a linear rank inequality, the instance must not
fail for the representable vector $w_A$. Therefore, the instance must use the value where v disagrees
with $w_A$. This means that there is a subset of $\{t_1, \ldots, t_k\}$ which is mapped by f to {A}; it follows
that there is some single value $j_A \in \{1, 2, \ldots, k\}$ such that $f(t_{j_A}) = \{A\}$. Similarly, since the
instance must not fail for $w_B$, there is a subset of $\{t_1, \ldots, t_k\}$ which is mapped by f to {B}, so
there exists $j_B \in \{1, 2, \ldots, k\}$ such that $f(t_{j_B}) = \{B\}$. And, for each $i \le n$, the instance must
not fail for $w_i$, so there is a subset of $\{t_1, \ldots, t_k\}$ which is mapped by f to $\{B, C_i\}$; hence, there
exists $j_i \in \{1, 2, \ldots, k\}$ such that $f(t_{j_i})$ is either $\{C_i\}$ or $\{B, C_i\}$. It is clear from these f values
that the numbers $j_A, j_B, j_1, j_2, \ldots, j_n$ are distinct; but this is impossible because $\{1, 2, \ldots, k\}$ has
fewer than n + 2 members. This contradiction completes the proof of the theorem. ■

8 Concurrent work and open questions 

During the preparation of this paper, the authors became aware of closely related concurrent work. 
Chan, Grant, and Kern [2] show nonconstructively that there exist linear rank inequalities not
following from the Ingleton inequality. Kinser [11] presents a sequence of inequalities which can
be written in the form

[displayed inequality not recoverable from this copy] (65)

for $n \ge 4$. (This is a variant of (25) which follows from Theorem 3; the instances for n = 4 and
n = 5 are permuted-variable forms of the Ingleton inequality and inequality (1), respectively.)
Kinser shows that (65) is a linear rank inequality for each $n \ge 4$ and uses a method similar to
the proof of Theorem 7 above to show that instance n of (65) is not a consequence of linear rank
inequalities on fewer than n variables. (The authors found the proof of Theorem 7 after the initial
posting date of [11], but independently.)

Here are some fundamental open questions that this research has not yet answered.
1) For each fixed n, are there finitely many linear rank inequalities on n variables which imply
all of the others? 

2) Is the method of using common informations incomplete? That is, are there linear rank 
inequalities that cannot be proved from the basic technique of assuming the existence of common 
informations? 

The authors would like to thank James Oxley for helpful discussions. 